Python and Matplotlib Essentials For Scientists and Engineers-Morgan & Claypool (2015)
Matt A Wood graduated with a BS degree in physics from Iowa State University, and
Master’s and PhD degrees in astronomy from the University of Texas at Austin. He spent
a year as a NATO postdoctoral fellow at the Université de Montreal in Quebec before
accepting a position as assistant professor at The Florida Institute of Technology. He spent
the 2008–2009 academic year on sabbatical at Radboud University in Nijmegen, The
Netherlands, where he was first introduced to the Python programming language. In 2012
he joined the Department of Physics and Astronomy at Texas A&M University-Commerce
as department head. His current research focuses on mass-transfer binary star systems
known as cataclysmic variables. He has been an author on more than 80 peer-reviewed
publications and a similar number of non-refereed publications. He lives in Greenville,
Texas, and when not doing astronomy or administrative tasks he enjoys playing guitar and
bass, walking his doberman Dexter and exploring the world with his wife Janie.
IOP Concise Physics
Python and Matplotlib Essentials
for Scientists and Engineers
Matt A Wood
Chapter 1
1.3 Resources
There are now countless books and websites devoted to Python and associated packages.
The webpage wiki.python.org/moin/PythonBooks includes a long list of book titles sorted
by category, as well as links to reviews. Some of the specific books that I have found
useful in preparing this monograph include:
Langtangen H P 2012 A Primer on Scientific Programming with Python 3rd edn
(Berlin: Springer)
Downey A 2012 Think Python: How to Think Like a Computer Scientist (Needham,
MA: Green Tea)
Fangohr H 2014 Introduction to Python for Computational Science and Engineering
(free download at www.southampton.ac.uk/~fangohr/software).
Many online resources exist as well. To list just a few that I have found useful:
python.org is of course the definitive Python resource on the web. A good starting
point is the Beginner’s Guide at wiki.python.org/moin/BeginnersGuide.
stackoverflow.com is a great question and answer site for programmers. Users vote
up the best answers, so they show up first and are easiest to find.
The Python course available at www.python-course.eu/index.php is very
comprehensive and includes many tutorials and examples.
Google has a Python class available at developers.google.com/edu/python/. The class
includes text, lecture videos and many coding examples.
1www.mathworks.com/products/matlab
2www.exelisvis.com/IDL
3www.astro.princeton.edu/~rhl/sm
4www.gnuplot.info
5www.gnu.org/software/octave
6See www.opensource.org for the open source definition.
7Source: xkcd.com/353.
Chapter 2
First steps
3.1 Working with strings
If you need to, you can always escape quote marks with a backslash (\), and if you
need to include comments, just lead with a hash sign at the prompt or in your programs:
We can also join the list elements back into a single string with spaces as a
separator:
Next, we need our list of integers, which we obtain with the range() function3. The
form of range() is range([start,] stop[, step]), where start if omitted
defaults to 0 and step if omitted defaults to 1. If all three arguments are present, the
function returns a list of integers
[start, start + step, start + 2*step, ...]. If step is positive, the last element is the largest
start + i*step less than stop. If step is negative, then the last element is the smallest
start + i*step greater than stop. For example
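A minimal sketch of these rules, written for Python 3, where range() is lazy and has to be wrapped in list() to display its values. The variable names are illustrative only.

```python
# Python 3: range() is lazy, so wrap it in list() to see the values.
counting_up = list(range(5))            # start defaults to 0, step to 1
with_start = list(range(2, 8))          # stop itself is never included
with_step = list(range(0, 10, 3))       # last element is the largest value < 10
counting_down = list(range(10, 0, -4))  # last element is the smallest value > 0
print(counting_up, with_start, with_step, counting_down)
```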
Finally, we put it all together with a loop. Note that when entering these
commands, you will need to indent the lines beginning with and to
indicate they are within the loop. Python simply uses the indentation level to indicate
which lines of code belong in a given block—curly brackets or statements are
therefore not required. The standard indentation is four spaces, but you can use whatever
you like, as long as you are consistent within a given block:
Note that here we are importing the sys module to pass the command line arguments.
sys.argv[0] is the program name and sys.argv[1] is our command line argument.
The total number of command line arguments is given by len(sys.argv).
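As a sketch of how these pieces fit together, the helper below (describe_args is a hypothetical name, not part of the original program) takes the argument list as a parameter so that it can also be tried from the interpreter.

```python
import sys

def describe_args(argv):
    """Return the program name, the extra arguments and the total count."""
    program_name = argv[0]       # always the script name
    arguments = argv[1:]         # everything typed after the script name
    return program_name, arguments, len(argv)

if __name__ == "__main__":
    name, args, count = describe_args(sys.argv)
    print("program:", name)
    print("arguments:", args)
    print("len(sys.argv):", count)
```

Note that the arguments always arrive as strings, so a numerical argument has to be converted with int() or float() before it is used in arithmetic.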
Now would be a good time to save this code to a file called 8, after which
you can run it using the command
but for a program you will use frequently, it will be more convenient to make it an
executable. In Unix-like systems, this is accomplished with the chmod (change mode)
command, which can make a script executable directly from the command line:
The IPython shell numbers your commands for later use. In much of the rest of this
book, I will continue to show the standard Python prompt for examples, but when
you are actually doing your own work, my strong recommendation is that you use IPython
for everything (or an IDLE or IPython Notebook) whenever you are developing code.
You can also run an external program from the Python prompt if IPython is not
available. If there are no command line arguments, then you can use the
function
If you need to pass command line arguments, then things are not so straightforward:
1In this book, text that is in represents text that you enter or that is returned by the computer. Text that is entered is colored black, and text that is returned is colored dark blue. Code snippets are in ivory-colored boxes with light gray borders and complete standalone programs are in ivory-colored boxes with gold borders.
2The functions and are methods of the class. Methods are called using dot notation, for example . Methods, including how to define them, are discussed more fully in chapter 9.
3Python 2.x includes the function xrange(), which is an iterator object that returns the integers one at a time and so
conserves memory when the calling argument is a very large integer. In Python 3.x, range() is an iterator object that
behaves like xrange() in Python 2.x and the original range() is deprecated.
4This code snippet also gives a sneak peek at the statement and flow control, discussed further in chapter 7.
5The link defaults to Python 3.x documentation, but the 2.x tutorial is just a click away.
6code.google.com/p/spyderlib/
7See ipython.org/notebook.html.
8Example codes named in the text are available for download at pythonessentials.com.
Chapter 4
Note the first two lines give the expected results, but that yields when in the
Python 2.x interpreter, because the result was rounded down to the nearest integer (floor).
This results because Guido ‘BDFL’ van Rossum adopted the typical rule from C (also true
in Fortran) that the result of an equation is always of the same type as the operands. So,
divide a float by a float, you obtain a float; divide an integer by an integer, you obtain an
integer. The former case is fine, but the BDFL now considers the latter to be a design
bug1. Although there are many codes out there that rely on this behavior, for the rest of us
it is an annoyance we have to know about and avoid. The good news is that Python 3.x
implements true division for both integers and floats2, so will return . Python 2.x
allows you to execute the command from __future__ import division to
yield this same behavior. If using Python 2.x and you have not run this command, you can
simply put a decimal point after at least one of the integers in a division operation, which
will promote the result to a floating point number:
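The following sketch shows the Python 3.x behavior described above; the numbers are arbitrary. The // operator gives floor division, which matches what Python 2.x returns when / is applied to two integers.

```python
# Python 3 semantics: / is true division, // is floor division.
true_quotient = 7 / 2        # 3.5, even though both operands are integers
floor_quotient = 7 // 2      # 3, what Python 2.x returns for 7 / 2
negative_floor = -7 // 2     # -4: floor division rounds toward minus infinity
float_operand = 7 / 2.0      # 3.5 in both Python 2.x and 3.x
print(true_quotient, floor_quotient, negative_floor, float_operand)
```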
The order in which operations are completed in a statement with multiple
mathematical operators is similar to other programming languages and can be remembered
with the acronym PEMDAS (Parentheses, Exponentiation, Multiplication/Division,
Addition/Subtraction).
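A few arbitrary expressions illustrate the precedence rules (Python 3 division is assumed in the last line):

```python
no_parentheses = 2 + 3 * 4 ** 2      # 4**2 first, then 3*16, then 2+48
with_parentheses = (2 + 3) * 4 ** 2  # parentheses first: 5 * 16
negated_power = -2 ** 2              # exponentiation binds tighter than unary minus
left_to_right = 8 / 4 * 2            # equal precedence is evaluated left to right
print(no_parentheses, with_parentheses, negated_power, left_to_right)
```

When in doubt, adding parentheses costs nothing and makes the intended order explicit.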
Variables are easy to assign and work with. The statement
is an assignment statement. What happens when this is entered into the interpreter is that
Python creates the float object and binds the name to that object.
In computer programs it is very common to iterate and so lines of code of the form
are common. As an algebraic statement, this makes no sense, but what this line of code is
telling the interpreter (or compiler in a compiled language), is to evaluate the expression
on the right-hand side of the equals sign and then assign the resulting value to the variable
name on the left-hand side of the equation. Because this is such a common operation,
Python has available the compact notation so, for example,
When in interactive mode, the previous result is available through the variable _
(a single underscore). This can be very useful when using Python as a calculator.
The modulus function can be very useful in certain situations (for example, printing
diagnostics every 100 time steps of a simulation):
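A minimal sketch of that diagnostic pattern, using the % operator; the loop body and the variable names are placeholders rather than the book's own code.

```python
n_steps = 500
report_interval = 100   # hypothetical value: report every 100 steps
reported = []
for step in range(1, n_steps + 1):
    # ... one time step of the simulation would go here ...
    if step % report_interval == 0:     # remainder is zero every 100th step
        reported.append(step)
        print("completed step", step)
```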
Adapting our code from section 3.1.5,
It is probable that you will often want to use the value of π in your programs. You can
enter it yourself, but if you execute at the command line or at
the top of your program, you will have it available to machine precision when you need it.
is an example of a module and modules are extensions to Python that can be
imported to extend the capabilities of the base language:
The math module provides access to the standard mathematical functions defined by
the C standard and you have the option to simply include everything in the module with
from math import * at the top of your programs, although as discussed below it is
generally safer to just import what you need to avoid conflicts in the namespace:
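The two safer import styles can be sketched as follows; the radius and side lengths are arbitrary.

```python
import math                  # qualified access: math.pi, math.sqrt(), ...
from math import pi, sqrt    # or import only the names you need

circumference = 2 * math.pi * 1.5      # circle of radius 1.5
hypotenuse = sqrt(3 ** 2 + 4 ** 2)     # 3-4-5 right triangle
print(pi, circumference, hypotenuse)
```

With the qualified form it is always clear where a name comes from, at the cost of a little extra typing.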
Here is a selected list of a few useful functions from the math module:
Complex numbers are also straightforward to work with, where the suffix j indicates the
imaginary part of the value:
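For instance, with two arbitrary values:

```python
z = 3 + 4j
w = complex(1, -2)              # equivalent to 1 - 2j
real_part, imag_part = z.real, z.imag   # both are stored as floats
modulus = abs(z)
conjugate = z.conjugate()
product = z * w
print(real_part, imag_part, modulus, conjugate, product)
```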
4.2.1 Lists
Python includes several compound data types. The list is a very useful sequence construct.
It can be written as a list of comma-separated values between square brackets:
Note that the elements of a list do not have to all be of the same type—in the example,
we have strings, an integer and a float all in the same list.
You can add elements on to the end of your list, replace elements within your list,
remove items from your list, insert some, reverse the list, or clear the entire list:
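A short sketch of these operations on a made-up list; each line changes the list in place.

```python
items = ["spam", "eggs", 42, 3.14]   # hypothetical contents of mixed type
items.append("ham")        # add to the end
items[0] = "bacon"         # replace an element
items.remove("eggs")       # remove the first matching item
items.insert(1, "toast")   # insert before index 1
items.reverse()            # reverse in place
snapshot = list(items)     # keep a copy before emptying the list
del items[:]               # empty the list in place (items.clear() in Python 3)
print(snapshot, items)
```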
As noted briefly in the previous chapter, strings can be concatenated with the + sign:
Lists can also contain other lists, which is useful for some applications:
4.2.2 Slicing lists
Lists, like strings, can be sliced and indexed as needed:
which is equivalent to
4.2.4 Tuples
Python also includes a data type called a tuple (in fact, our most recent example returned a
list of tuples). Tuples, like lists, are sequences. What is special about tuples is that they
cannot be changed—they are immutable (lists are mutable). You might think of a tuple as
a ‘constant list’. Tuples are indicated by simply separating some values with commas and
are often enclosed in parentheses:
The empty tuple is written as (). You can slice tuples just as you can lists and you
can convert a list to a tuple or vice versa if need be:
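The sketch below uses arbitrary values and also shows what happens when you try to modify a tuple.

```python
point = (1.0, 2.0, 3.0)
also_a_tuple = 4, 5, 6          # the parentheses are optional
empty = ()
first_two = point[:2]           # slicing a tuple returns a tuple
as_list = list(point)           # a mutable copy
back_to_tuple = tuple(as_list)
try:
    point[0] = 9.9              # tuples are immutable, so this fails
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```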
Tuples are used behind the scenes in Python and you may use them when calling
functions. Again, they work very much like lists, with the exception that they cannot be
changed.
The zip() function iterates over two or more sequences or iterables in parallel. Most
commonly, it takes two or more lists and returns a list of tuples, where the ith tuple
contains the ith element from each of the argument lists:
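A minimal sketch with made-up lists. In Python 3, zip() returns a lazy iterator rather than a list, so it is wrapped in list() here.

```python
letters = ["a", "b", "c"]
numbers = [1, 2, 3]
pairs = list(zip(letters, numbers))          # list() is needed in Python 3
shorter = list(zip(letters, [10, 20]))       # stops at the shortest input
unzipped_letters, unzipped_numbers = zip(*pairs)   # the reverse operation
print(pairs, shorter)
```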
To make a copy of a list that does not refer to the same object, you can use
or for simple lists3:
There may be situations where your list itself contains other objects like lists or class
instances. If you have such a situation, you can use
. See section 9.3 below for more on when you would
need to make a deep copy.
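The difference between an alias, a shallow copy and a deep copy can be sketched as follows. The slice, list() and copy.copy() forms shown here are standard ways to make a shallow copy and are offered as an illustration, not necessarily the exact calls used in the original example.

```python
import copy

original = [1, 2, [3, 4]]
alias = original                  # same object, not a copy
shallow = original[:]             # list(original) or copy.copy(original) also work
deep = copy.deepcopy(original)    # copies the nested list as well

original[0] = 99                  # visible through alias only
original[2].append(5)             # visible through alias AND shallow: inner list is shared
print(alias, shallow, deep)
```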
That is probably not what you expected! What you obtained was three copies of
concatenated together. You can achieve the behavior you want with the following list
comprehension:
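With an arbitrary three-element list, the two behaviors compare as follows:

```python
values = [1, 2, 3]
repeated = values * 3                  # concatenates three copies of the list
tripled = [3 * v for v in values]      # multiplies each element by three
print(repeated)
print(tripled)
```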
This works, and you can use similar constructions for other arithmetic operations, but
it is a bit cumbersome. In science and engineering disciplines, we are mostly going to be
dealing with arrays of numbers, which are not included as a core feature in Python. So, let
us introduce the package NumPy, which you will probably import at the beginning of
nearly all of your codes that work with numerical data.
1See python-history.blogspot.com/2009/03/problem-with-integer-division.html.
2Integers and floating point numbers (floats) are stored differently in the computer memory.
3For more information on the copy module, see docs.python.org/2/library/copy.html.
Chapter 5
NumPy arrays
5.1 Creating and reshaping arrays
We just saw that the default Python behavior for dealing with lists of numbers was not
what we really wanted. The NumPy module, however, will give us the tools we need.
NumPy’s main object is the array, which is a table of elements all of the same type, with
an arbitrary number of dimensions (or axes) as needed. The number of dimensions or axes
is called the rank of the array. For example, the coordinates of a single point in 3D
space (for example, ) is represented by an array of rank 1 and that
axis has a length of 3. An array that holds the coordinates of four point masses
might be given by
This is a 2D (rank 2) array. In this example, the first dimension (axis) has a length of 4 and
the second has a length of 3.
We can access rows, columns and individual elements in the array just as we did for
lists using indexing and slicing:
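A sketch of a rank-2 array and a few ways of accessing it. The coordinates are invented for illustration, and the np alias for NumPy is an assumption of this example.

```python
import numpy as np

# Hypothetical coordinates (x, y, z) of four point masses.
positions = np.array([[0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0],
                      [0.0, 2.0, 0.0],
                      [0.0, 0.0, 3.0]])
print(positions.ndim, positions.shape)   # 2 axes, with lengths 4 and 3

second_row = positions[1]          # coordinates of the second mass
y_column = positions[:, 1]         # y coordinate of every mass
one_element = positions[3, 2]      # z coordinate of the fourth mass
first_two = positions[:2, :]       # a slice containing the first two rows
```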
The array class is called numpy.ndarray. You can create an array from an existing
(numerical-only) list using
to accomplish the same task, experienced Python programmers usually advise against this,
as it introduces many other definitions into your current namespace and this can
potentially cause conflicts. For example, if you first import another module that also
defined an object and then used the new definition
would override the other. The use of qualified names (e.g. ) helps to avoid
such collisions and also helps make clear where those definitions are coming from.
You can easily find and change the attributes of your arrays using methods of the class
Arrays must be homogeneous, meaning all values have the same data type. So if you
enter a mix of integers and floating point numbers the integers will be converted to
floating point values and if there is a single complex number input all entries will be
converted to complex numbers:
5.1.1 NumPy arange
Note that numpy.arange() works like the range() function we saw in section 3.1 above, but
returns an array instead of a list. As with range(), you specify at a minimum
the stop value. You may not want your sequence to start at zero, or you may want your
sequence to be floating point values and/or you may want to count down instead of up. As
with range(), these requirements are easy to accomplish:
5.1.2 NumPy linspace
As a general rule, it is not a good idea to use numpy.arange() with floating point values,
because round-off in the floating point arithmetic means
there may or may not be an extra value tagged on at the end (see section 5.6). It is much
better practice to use the numpy.linspace() function, which takes as an argument the
number of elements to return, instead of the desired step size. Also, unlike range() or
numpy.arange(), the numpy.linspace() function fills the array to include both end points. If this is
not what you want, you can pass endpoint=False as a keyword argument:
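The following sketch contrasts the two functions; the interval and the number of points are arbitrary, and the np alias is an assumption of this example.

```python
import numpy as np

integers = np.arange(5)                   # 0, 1, 2, 3, 4
countdown = np.arange(10, 0, -2)          # 10, 8, 6, 4, 2
closed = np.linspace(0.0, 1.0, 11)        # 11 points, both end points included
half_open = np.linspace(0.0, 1.0, 10, endpoint=False)   # 10 points, 1.0 excluded
print(closed)
print(half_open)
```

Because linspace() is told how many points to return, the length of the result never depends on floating point round-off.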
where the last statement gives standard matrix multiplication if both operands are
2D arrays. If instead both operands are 1D arrays (i.e., vectors) then numpy.dot() returns the
standard inner product of the vectors (without complex conjugation). The function
numpy.cross() returns the cross product of two vectors:
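A small sketch with arbitrary matrices and unit vectors shows the difference between the elementwise product, the matrix product and the vector products.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])
elementwise = a * b            # element-by-element product
matrix_product = np.dot(a, b)  # standard matrix multiplication

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
inner = np.dot(u, v)           # inner product of two vectors
cross = np.cross(u, v)         # x-hat cross y-hat gives z-hat
```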
You may sometimes want to perform operations and overwrite the original array:
It will often be useful to find the minimum or maximum value of an array. The array
class provides methods and that return these. Having already imported
, we will use the function to return a list of pseudo-random
numbers in the half-open interval [0.0, 1.0) with a uniform distribution2:
It might be that you need the maximum or minimum of a given row or column, in
which case you could specify which axis to search:
5.3 Dictionaries
The Python dictionary object provides a very flexible means of storing information.
Perhaps you have a list that has the mass densities in units of g cm−3 for selected
substances:
For this to be useful we need to know what substance each of these list items represents.
This is where a dictionary can be useful. A dictionary object can be created as follows
using curly brackets {} and key-value pairs (or simply items) each separated by a colon
where, in our example, the substance name is the key:
Note that the printed order of the key-value pairs is not the same as what we input,
because the information is stored as a hashtable. If you enter the same statements on your
computer, the order may be different from that above. This behavior is not a problem
because the dictionary is not accessed using an index as a sequence is, but rather by the
key.
With the above definition for , we can retrieve the density of ice using a statement
We can add to the dictionary and print it, and can return the length of the dictionary
using len():
The keys and values can each be extracted into new lists:
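The dictionary operations of this section can be sketched as follows. The name density and the numerical values are illustrative approximations, not the book's own listing. Note that in Python 3.7 and later, dictionaries preserve insertion order, and keys() and values() return views that need list() to become lists.

```python
# Approximate mass densities in g/cm^3, used here only as illustrative values.
density = {"ice": 0.92, "water": 1.00, "iron": 7.87}

ice_density = density["ice"]       # look up a value by its key
density["lead"] = 11.34            # add a new key-value pair
n_items = len(density)
substances = list(density.keys())  # list() is needed in Python 3
values = list(density.values())
print(density)
```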
Round-off error is a fact of life when programming and is the reason why it is best to
avoid comparing floats as equal in conditional statements. The following example code
would seem to print the numbers from 0.0 to 0.9 in increments of 0.1 and then stop when
t = 1.0. The actual behavior is that the conditional expression never tests as True
and so the loop is infinite. The built-in function repr() returns a string containing
the full (‘official’) string representation of an object, whereas the str() function returns
an ‘informal’ (potentially less accurate) string representation of the object:
Rather than testing for equality, it is much safer to check that you have reached the target
value within some tolerance. For example, the following code terminates at t = 1.0, as
intended:
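A minimal sketch of a tolerance test. The tolerance value and the step cap are assumptions of this example, and math.isclose() requires Python 3.5 or later.

```python
import math

t, dt, target = 0.0, 0.1, 1.0
tolerance = 1.0e-9          # hypothetical tolerance
steps = 0
# The step cap is a safety net so that a mistake cannot cause an infinite loop.
while abs(t - target) > tolerance and steps < 1000:
    t += dt
    steps += 1

exactly_equal = (t == target)       # False: round-off leaves t just below 1.0
close_enough = math.isclose(t, target, abs_tol=tolerance)
print(repr(t), steps, exactly_equal, close_enough)
```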
The module contains the routines, which are optimized for linear
algebra. For example, it is trivial to solve a matrix equation of the form ax = b for vector x.
If you are working with very large matrices, you should consider instead using the
module, because typically SciPy is built using the optimized
ATLAS3 LAPACK and BLAS libraries, which results in very fast linear algebra
performance. However, in this case you will need to use the class instead of the
class.
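As a sketch of solving ax = b, the example below assumes the numpy.linalg.solve() function and an arbitrary 2×2 system, and uses ordinary arrays throughout.

```python
import numpy as np

a = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(a, b)      # solves a x = b without forming the inverse
residual = np.dot(a, x) - b    # should be zero to within round-off
print(x, residual)
```

Checking the residual is a cheap way to confirm that the returned vector really satisfies the original equation.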
We have not discussed SciPy up to this point, but it is worth mentioning that
essentially everything available in NumPy is also available in SciPy. Often the routines are
identical, but when they differ the SciPy routines are usually faster. To quote the SciPy
FAQ4:
In an ideal world, NumPy would contain nothing but the array data type and the
most basic operations: indexing, sorting, reshaping, basic elementwise functions, et
cetera. All numerical code would reside in SciPy. However, one of NumPy’s
important goals is compatibility, so NumPy tries to retain all features supported by
either of its predecessors. Thus NumPy contains some linear algebra functions,
even though these more properly belong in SciPy. In any case, SciPy contains
more fully featured versions of the linear algebra modules, as well as many other
numerical algorithms. If you are doing scientific computing with Python, you
should probably install both NumPy and SciPy. Most new features belong in SciPy
rather than NumPy.
1MATLAB, for example, calculates the matrix product when the * operator is used.
2If you run this example, your numbers will differ from what is shown.
3math-atlas.sourceforge.net
4www.scipy.org/scipylib/faq.html
Chapter 6
The read() command will return the entire file as a single string (including the
newline character \n) if no argument is passed, or just the number of bytes passed as an
argument, which for example can be useful for reading binary files. The seek(0)
command resets the current position back to the beginning of the file and close() closes
the file:
The file can alternatively be read using readlines(), which returns a list
containing the lines in the file:
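The sketch below first writes a small throwaway file so that it is self-contained, then reads it back with the file methods described above. The file name and contents are hypothetical.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "example.txt")   # hypothetical file name
with open(path, "w") as f:
    f.write("first line\nsecond line\n")

f = open(path)
whole_file = f.read()      # entire file as one string, newlines included
f.seek(0)                  # back to the beginning of the file
first_chars = f.read(5)    # only the first five characters
f.seek(0)
lines = f.readlines()      # list of lines, each ending in "\n"
f.close()
```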
This program takes the file name from the command line, iterates on the file itself and
iterates within the line. Indeed, this can be made even more compact with the use
of a nested list comprehension:
and for example if we know there are two columns of data in our file and we want to put
these into array objects, we could add the lines
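A nested list comprehension of this kind might look like the sketch below. A string stands in for the file contents so that the example runs on its own, and the data values are made up.

```python
text = """1.0 2.5
2.0 3.5
3.0 4.5
"""   # stands in for the contents of a two-column data file

# Outer loop runs over lines, inner loop over the fields within each line.
rows = [[float(field) for field in line.split()] for line in text.splitlines()]

# Transpose the rows into one list per column.
x_values, y_values = [list(column) for column in zip(*rows)]
print(x_values, y_values)
```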
Now that we have discussed the ‘hard’ way to accomplish this, let us discuss the easier
path that affords for reading files containing numerical data.
where you will notice that the first two lines are comments serving as column headers.
To read this file, all we need do is
Note that because the first two rows of the file start with the comment symbol they
are ignored by . If you have some number of rows at the top of your file that
you want to skip but that do not start with , you can simply include the
keyword when calling . When using comment lines are included
in the count of skipped lines, so it will behave as you want it to, no matter if the rows you
want to skip begin with or not.
For example, given our CO2 data file of monthly averages, if we wanted to not read in
the first ten years of data, we would need to skip the two header lines and 120 data lines,
for a total of 122 lines:
Here column 1 is the year, column 2 is the month, column 3 is the decimal date, columns
4, 5 and 6 are different estimations of the CO2 concentration averages and the last column
is the number of days going into the monthly average.
If we want to read this file directly, we can simply read everything with
We could then copy the data of interest (the 3rd and 4th columns) to two 1D arrays as
follows:
but provides a more direct solution. If we just want the decimal date and the
direct average concentration, we can use the keyword to specify which of these
columns we want to read, where the index starts at zero for the first column. Even more
useful, we can the data and load them directly into 1D arrays that we can later
pass to (for example) the Matplotlib functions:
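As a hedged sketch, assuming the function in question is numpy.loadtxt() with its skiprows, usecols and unpack keywords, the steps above could look like this. The rows are made up and use a simplified four-column layout, and a StringIO object stands in for the data file.

```python
import io
import numpy as np

# Made-up rows in a simplified four-column layout; not real measurements.
text = """# year month decimal_date average
# (two comment lines serve as the header)
1958 3 1958.208 315.7
1958 4 1958.292 317.5
1958 5 1958.375 317.5
"""

everything = np.loadtxt(io.StringIO(text))               # comment lines are ignored
later_rows = np.loadtxt(io.StringIO(text), skiprows=3)   # 2 comments + 1 data line
date, average = np.loadtxt(io.StringIO(text), usecols=(2, 3), unpack=True)
print(date, average)
```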
Then you can use methods of the datetime module to return useful information or
to reformat the date and time using strftime()4:
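For example, with a hypothetical ISO 8601 timestamp:

```python
from datetime import datetime

# Hypothetical timestamp in ISO 8601 form.
stamp = datetime.strptime("2015-03-14T09:26:53", "%Y-%m-%dT%H:%M:%S")

year, month, day = stamp.year, stamp.month, stamp.day
day_of_week = stamp.weekday()                    # Monday is 0, so Saturday is 5
reformatted = stamp.strftime("%d/%m/%Y %H:%M")   # day/month/year hours:minutes
print(reformatted)
```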
you can read this file and examine the results using the following:
Note that the function was able to determine that the first line
contains and that the data types of the four columns were , ,
and , respectively. The full utility of the data table object is
beyond the scope of this text, but note that the column names can be accessed via
Tables, like lists, are mutable so data in them can be changed in place, and rows and
columns can be deleted or added as needed. As noted above, from the
Astropy library can also read LaTeX tables directly, so if we had found our data in the
LaTeX source code of a paper on and saved it to a file on our local disk as
we could read that file using the following and obtain exactly the same table object as we
obtained above. A related function discussed below that may be useful to you is
, which can write your data as a LaTeX table6.
and then can easily print out the equivalent Julian date, modified Julian date, or convert to
another time scale. If you do not need this functionality, then jump to the next section, but
if you do need this functionality, then something like this module may have been on your
wish list for years. See the documentation for more information and send your thanks to
the developers.
You have control over the precision of the printed output with the
attribute, which gives the number of digits after the decimal point when outputting a value
that includes seconds. The default is three and the maximum precision is nine:
Finally, note that the Astropy module can also read and write files using the binary
HDF5 and FITS formats, as we discuss in section 6.2.4 below.
If you have programmed in any other languages, the format codes for this method
should be easily understandable. The general form is
, where
So, the code means print a floating point number with a width of four
characters in the format (the decimal point counts as one character). The code
means print an integer with three character spaces, including a sign if negative. There are
additional specifiers not listed here for binary, octal and hexadecimal numbers.
Although it might appear that the formatting is part of the function, this is not
the case. Instead the string object is acted upon by the modulo operator, which returns
another string, and it is this returned string that is passed to the print function:
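The point can be seen by building the formatted string first and printing it afterwards. The format codes and values below are arbitrary examples.

```python
value = 3.14159
count = -5
formatted = "%5.2f" % value                    # width 5, two digits after the point
both = "x = %5.2f, n = %3d" % (value, count)   # a tuple supplies several values
print(both)                                    # print receives an ordinary string
```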
which perhaps is not terribly clear, so let us convert our examples from above to the new
method:
As before with the string modulo operator, we again have a format string on the left
which has fields that will be replaced, however, here we indicate these fields with curly
brackets . The curly brackets and any format codes within will be replaced by the
formatted value of one of the arguments to the object. In the examples
above, the positional arguments , and were explicitly stated, along with
format codes. If the arguments are in the same order as you want things printed, then you
can leave them out. Similarly, if you do not care about the exact formatting of the
arguments, you can also leave out those specifiers:
However, if you want to use the arguments in a different order, or if you want to use an
argument more than once, then you do need to specify the positional parameters
You may have noticed that the general syntax for allowed keyword
arguments. This feature could be quite useful for complicated print statements, as it makes
it easier to map from the arguments back to the format string:
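The variations on str.format() described above can be sketched as follows; all values and field names are hypothetical.

```python
a, b, c = 1.23456, 42, "text"   # hypothetical values

explicit = "{0:6.3f} {1:4d} {2}".format(a, b, c)     # positions and format codes
implicit = "{} {} {}".format(a, b, c)                # same order, default formats
reordered = "{2} {0:.2f} {0:.4f}".format(a, b, c)    # reuse and reorder arguments
keywords = "{name} = {value:.1f} {unit}".format(name="g", value=9.81, unit="m/s^2")
print(explicit, implicit, reordered, keywords, sep="\n")
```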
but if you employ unicode characters you can output the ‘±’ to the terminal (or file). The
unicode character for the ‘±’ symbol is \u00b1. If you include this in your string with a
‘u’ in front of the string to tell Python to interpret the string as unicode, you can obtain the
result in this form using
which outputs a=5.34±0.02 to the terminal or your file. Note, it may still be preferable to
use the ‘ ’ combination depending on your application. A full list of unicode characters
is available at unicode-table.com.
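A minimal sketch that reproduces the output quoted above. The variable names are assumptions, and in Python 3 the u prefix is optional because all strings are unicode.

```python
a, sigma = 5.34, 0.02
message = u"a={0:.2f}\u00b1{1:.2f}".format(a, sigma)   # \u00b1 is the plus-minus sign
print(message)
```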
Printing integers with commas
Finally and somewhat randomly, if you are printing large integers, you might prefer to
make them more easily readable for humans. Python lets you use a thousands separator:
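One way to do this is the comma option of the format specification; the numbers below are arbitrary.

```python
population = 7300000000        # hypothetical large integer
readable = "{:,}".format(population)
readable_float = "{:,.2f}".format(1234567.891)   # also works with a precision
print(readable, readable_float)
```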
After executing either of these statement blocks from the interpreter (or a file), your
working directory will contain the file with the following lines:
If you have numerical data that you want to write in columns to a file, there are several
ways you can do this, four of which are shown in the example below
( ). All four give identical output. The first example is the most
straightforward and perhaps the first thing you would think of if coming to Python with
previous experience in C/C++ or Fortran. Example 2 brings the loop inside a list
comprehension, saving one line (‘Flat is better than nested’). Examples 3 and 4 both use
the with statement, saving another line. These use write() and
writelines(), respectively, where write() writes a single line to a file and
writelines() writes a sequence of strings to the file. The writelines()
method requires the entire sequence to be created in memory before writing to the file and
so example 4 is less memory efficient than example 3. Therefore, of the examples shown,
I recommend example 3 as the best; however, NumPy provides the savetxt() function,
which is in practice what you will probably use to write columns of numbers to a file.
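A sketch in the spirit of the approach recommended above, using a with block and one write() call per line. The data, format codes and file name are hypothetical, and the file is read back only to show what was written.

```python
import os
import tempfile

x = [0.0, 0.5, 1.0]
y = [1.0, 1.5, 2.5]    # hypothetical data
path = os.path.join(tempfile.mkdtemp(), "columns.txt")   # hypothetical file name

# The with statement closes the file automatically, even if an error occurs.
with open(path, "w") as f:
    for xi, yi in zip(x, y):
        f.write("{0:6.2f} {1:6.2f}\n".format(xi, yi))

with open(path) as f:
    written = f.read()
print(written)
```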
6.2.3 NumPy savetxt
Saving an array to a text file is straightforward using the numpy.savetxt() function.
To save your two-column data to a file you could simply enter
which will output a file containing the values on the first line of the file and the values
on the second line of the file, with all values by default printed in exponential format to
machine precision, which is not convenient:
We can write our arrays in columns by including in the
call to and we can format our data to the appropriate number of significant
figures by including the keyword argument.
If using the same and arrays we call using
If you would like to include header or footer lines, you can pass strings to the header
and footer keyword arguments. This example also demonstrates that if you only include
a single format specifier, it will be used for all values:
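The sketch below writes two made-up arrays in columns with a header line. Using numpy.column_stack() to arrange the columns is this example's own choice, and a StringIO buffer stands in for a file on disk.

```python
import io
import numpy as np

x = np.array([0.0, 0.5, 1.0])
y = x ** 2
buffer = io.StringIO()          # stands in for a file on disk

# One format specifier is applied to every value; the header is written
# as a comment line starting with "# ".
np.savetxt(buffer, np.column_stack((x, y)), fmt="%6.3f", header="x  y")
lines = buffer.getvalue().splitlines()
print("\n".join(lines))
```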
Then contains
If you would like to include multiple comment lines at the top of your file, you can do
something similar to the following:
Then contains
6.2.4 Astropy
As noted above, the Astropy function is very flexible and can also write
your data in several useful formats. If you want to write your file with the column
headings written as a comment line (so the file could be read directly with
):
then the resulting file will contain
There are other available packages for reading and writing HDF5 files in Python,
including the h5py package available at www.h5py.org. Quoting from the website, ‘[t]he
h5py package provides a Pythonic interface to the HDF5 binary data format’ and there
exists the book Python and HDF5 written by Andrew Collette, the lead author of the h5py
package. The h5py package provides a lower-level interface to HDF5 files, but may
include features you need that the Astropy package does not.
1www.esrl.noaa.gov/gmd/ccgg/trends/
2See docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html.
3See, e.g., en.wikipedia.org/wiki/ISO_8601 and perhaps also xkcd.com/1179.
4For a list of all the format codes, see strftime.org.
5Visit www.astropy.org. If using Anaconda Python, installation is as simple as typing at a terminal command prompt.
6In the current example, the table above was created with the command .
7See astropy.readthedocs.org/en/latest/time. From the documentation, ‘All time manipulations and arithmetic operations are done internally using two 64-bit floats to represent time… [T]he Time object maintains sub-nanosecond precision over times spanning the age of the universe’.
8See www.hdfgroup.org/HDF5.
9FITS stands for Flexible Image Transport System. The FITS format is widely used by astronomers and although a binary file format, has the advantage that the metadata are included in a human-readable (ASCII) header. See fits.gsfc.nasa.gov/fits_home.html.
Chapter 7
7.1 Conditionals
Python uses the boolean True and False objects in conditional statements:
7.2 if statements
One of the most basic statement types is the if statement to choose between different
code blocks depending on the result of a conditional test. For example,
The elif statement is short for else if and these, as well as the else statement, are
optional.
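A minimal sketch of an if/elif/else chain. The function name classify and the thresholds are illustrative only.

```python
def classify(temperature_c):
    """Return a label for water at the given temperature in degrees Celsius."""
    if temperature_c <= 0.0:
        state = "ice"
    elif temperature_c < 100.0:    # tested only if the first condition is False
        state = "liquid"
    else:                          # runs when no earlier condition is True
        state = "steam"
    return state

is_cold = 5.0 < 10.0               # comparisons evaluate to True or False
labels = [classify(t) for t in (-5.0, 20.0, 120.0)]
print(is_cold, labels)
```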
7.3 for loops
In Python, the for statement loops over the elements in a sequence. While that sequence
can be a sequence of numbers,
it may be that you want to loop over the elements in a list of strings:
If you need to iterate over the indices of a sequence, you can do so by using the
range() and len() functions:
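The three loop styles described in this section can be sketched with a made-up list:

```python
planets = ["Mercury", "Venus", "Earth"]

squares = []
for n in range(4):                  # loop over a sequence of numbers
    squares.append(n ** 2)

lengths = []
for name in planets:                # loop directly over the list elements
    lengths.append(len(name))

numbered = []
for i in range(len(planets)):       # loop over the indices
    numbered.append("{0}: {1}".format(i, planets[i]))
print(squares, lengths, numbered)
```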
Should you actually want to implement this, you may find it useful to know that you can
find the distance between two vector positions using :
7.4 while statements
An alternative method which can be useful is to loop over a sequence while some
condition is True and to stop when that condition becomes False:
The potential problem is that, if we are not careful, the condition might not be met and
we then have an infinite loop. Generally, it is safer to use for loops.
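A short sketch of a while loop with a safety cap. The starting value and the cap are arbitrary, and the cap is one possible guard against the infinite-loop problem just described.

```python
value = 100.0
halvings = 0
max_iterations = 1000       # safety cap so the loop cannot run forever

while value > 1.0 and halvings < max_iterations:
    value /= 2.0            # the condition is re-tested after every pass
    halvings += 1
print(halvings, value)
```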
The continue statement skips the rest of the statements in the current loop block
and continues to the next iteration of the loop. The following example program
demonstrates the use of the , and
statements:
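As an illustration (not the book's own program), the loop below uses continue to skip even numbers and break to stop once the values exceed a limit.

```python
odd_squares = []
for n in range(100):
    if n % 2 == 0:
        continue            # skip even numbers and go to the next iteration
    if n > 9:
        break               # leave the loop entirely
    odd_squares.append(n ** 2)
print(odd_squares)
```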
Make the code executable with then run
Some blocks of code will be useful in multiple projects. An example would be the
standard math routines (e.g., , , , etc.). These of course are so
useful that they are part of the distribution of any language. But perhaps your simulation
program writes output files in a specific format and you have several different programs
that you use to analyze and visualize the results. You could cut and paste the lines of code
that read your file format from program 1 to program 2, to program 3, etc., but then if you
change the output format of your simulation program you have to update all of your other
programs. Instead, it is more efficient to put your file-reading code into a module that
you can import into any program. Then if you change the file format, you only need to
change the code in the module—not in each program separately.
Tim Peters posted the following on June 4, 1999, with the title
The Python Way1. It has since come to be known as The Zen of Python and is an ‘easter
egg’ available at any time using import this at the interpreter prompt:
but of course you will generally want to save your code in files for later use or simply for
a more efficient code development process. A module is simply a file that contains one or
more function definitions and associated statements.
The simple form of a function definition is
The function takes one or more arguments, does something with them and returns
a result. The returned result is called the return value. Note that it is possible for a function
to take no arguments and return no explicit result, but when defining a function you must
include the empty parentheses after the function name, even if you do not pass any
arguments to the function.
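A minimal sketch of both cases, using hypothetical function names: one function that takes arguments and returns a value, and one that takes no arguments and has no explicit return.

```python
def kinetic_energy(mass, speed):
    """Return the kinetic energy (J) of a mass (kg) moving at speed (m/s)."""
    return 0.5 * mass * speed ** 2

def greet():
    # No arguments and no return statement: the call returns None.
    print("hello")

energy = kinetic_energy(2.0, 3.0)   # the return value is bound to energy
nothing = greet()                   # the parentheses are still required
print(energy)
```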
The following example shows a common use of the function introduced in the
previous chapter. A valid function definition must have at least one statement following
the definition line, so when developing a program or module, you may use the pass
statement as a placeholder for a function you have not yet written (sometimes
called a program stub), or instead you might opt to include a docstring that describes what
the function will eventually do:
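Two hypothetical stubs illustrate the idea; the function names are assumptions. Note that a bare comment is not a statement, so the second stub uses a docstring as its body:

```python
def read_snapshot(filename):
    pass    # placeholder: does nothing yet, so the call returns None

def write_snapshot(filename, data):
    """Write the simulation data to filename (not yet implemented)."""

# Both stubs can already be called while the rest of the program is developed.
result_read = read_snapshot("run01.dat")
result_write = write_snapshot("run01.dat", [1.0, 2.0])
```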
If we import and call this code interactively, the interpreter prints the result to the
terminal, but no variables are actually set. In order to set a variable to the result, we have
to call with something of the form
Notice how the output changes between importing the code and running the code from
the IPython interpreter with or the shell prompt:
We could use a keyword argument to make this slightly more interesting. We would
like the default height to be 1 m, and the default acceleration to be the gravitational
acceleration at the surface of the Earth, but would also like the ability to use different
values for the height and acceleration if desired. We then have for our function
We can now call this function without any arguments and the defaults will be used. We
can call it using the keyword arguments and, if using the keyword arguments, the order
does not matter:
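The original function is not reproduced here. As a sketch under stated assumptions, suppose the function computes the time to fall from rest; the name fall_time, the parameter names, and the value 9.81 m/s^2 are all illustrative choices:

```python
import math

def fall_time(height=1.0, accel=9.81):
    """Time (s) to fall from rest through height (m) under accel (m/s^2)."""
    return math.sqrt(2.0 * height / accel)

t_default = fall_time()                              # both defaults used
t_moon = fall_time(height=1.0, accel=1.62)           # keyword arguments
t_moon_swapped = fall_time(accel=1.62, height=1.0)   # order does not matter
print("%.3f s on Earth, %.3f s on the Moon" % (t_default, t_moon))
```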
Keyword arguments are used a great deal in calls to the Matplotlib plotting functions
discussed in chapter 10 below.
8.4.1 Introduction
Python includes functional programming capabilities in addition to procedural
programming capabilities. Many common programming languages (e.g., C, Fortran,
Pascal, BASIC) are procedural, meaning that the program contains a series of
computational steps to be completed. Procedures (also variously called routines,
subroutines, or functions) can call other procedures, but the net result is a series of
programming statements that are completed one after the other. Functional programming
is also modular in approach, but functional programming languages tend to de-emphasize
or even remove the imperative elements of procedural programming2. Python is one of
several languages (e.g., C++, Java, Perl, MATLAB, Visual Basic .NET) that are multi-
paradigm. Here we will briefly introduce the most useful functional programming features
of Python.
where the brackets ‘[’ and ‘]’ help remind us that the result will be a list object. A simple
example using a list comprehension is
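For instance (a sketch with illustrative variable names), a list comprehension can build a list of squares, optionally with a filtering condition:

```python
squares = [x ** 2 for x in range(10)]
even_squares = [x ** 2 for x in range(10) if x % 2 == 0]
print(squares)
print(even_squares)
```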
Here is a more complex example using a nested list comprehension that returns a list
of prime numbers from 0 to 49. The first list contains all the composite numbers from 4 to 49,
generated as multiples of 2, 3, 4, … 7 (many of them more than once, e.g., 12 occurs four
times). The list of primes is created by finding the integers between 2 and 49 that are not
contained in the first list:
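The two-step construction described above can be sketched as follows; the list names noprimes and primes are assumptions, since the original names are not recoverable here:

```python
# Multiples of 2..7 (starting at twice each factor) between 4 and 49.
noprimes = [j for i in range(2, 8) for j in range(i * 2, 50, i)]

# Integers from 2 to 49 that never appear among those multiples.
primes = [x for x in range(2, 50) if x not in noprimes]
print(primes)
```

The membership test `x not in noprimes` scans the whole list each time, which is acceptable for a range this small.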
This method is reasonably efficient for finding small primes, but for finding very large
primes the above code would quickly fill all available system memory with the intermediate
list. In such a case, a generator comprehension would be more appropriate, since
generator objects simplify the task of writing iterators and maintain their state between
calls. That is, instead of storing the entire list, a generator yields one item each time it is
asked for the next value, so it uses far less memory than a list comprehension when the list is
just an intermediate step and does not actually need to be stored.
Here is a trivial example of a generator comprehension statement. Note that the
surrounding parentheses indicate the statement returns a generator object:
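A small sketch (variable names are illustrative) shows that the generator produces values only on demand and remembers where it stopped:

```python
squares = (x ** 2 for x in range(5))   # parentheses: a generator object

first = next(squares)      # computes only the first value
second = next(squares)     # resumes where the previous call stopped
rest = list(squares)       # consumes whatever is left
print(first, second, rest)

# A generator can feed a reduction without ever storing the full sequence.
total = sum(x ** 2 for x in range(1000))
```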
For example,
The map function can also be applied to more than one list at a time with the
proviso that both lists have to have the same length:
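A minimal sketch of mapping a two-argument function over two equal-length lists, assuming the built-in map() is the function meant here. The list() wrapper is only needed in Python 3, where map() returns an iterator and stops silently at the end of the shorter input:

```python
a = [1, 2, 3, 4]
b = [10, 20, 30, 40]

# The lambda receives one element from each list per call.
sums = list(map(lambda x, y: x + y, a, b))
print(sums)
```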
To extend this, we might have a data set for which we would like to remove data
points that are more than some number of standard deviations away from the mean, a
process called sigma clipping. For a true normal (i.e., Gaussian) distribution, we expect
99.7% of the samples to fall within ±3σ of the mean. For 1000 numbers drawn from a normal
distribution, we expect roughly three samples to fall outside this range. In the following
example, we draw 1000 samples from a normal distribution with mean 0.0 and σ = 1.0,
then filter those to return the values that are more than 3σ from the mean:
The comma at the end of the print statement suppresses the default newline. Note that
the same result can be obtained with a list comprehension which is arguably easier to read:
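The following sketch selects the 3σ outliers both ways. It is not the original listing: it assumes the standard-library random module rather than NumPy, and the seed and variable names are arbitrary choices:

```python
import random

random.seed(42)                       # fixed seed so the run is repeatable
samples = [random.gauss(0.0, 1.0) for _ in range(1000)]

# filter() keeps the elements for which the function returns True.
outliers = list(filter(lambda x: abs(x) > 3.0, samples))

# The same selection written as a list comprehension.
outliers_lc = [x for x in samples if abs(x) > 3.0]
print("number of outliers: %d" % len(outliers))
```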
Should you actually need to sigma clip your data, you can use the function
from . Using the same array from the previous
example, will return an array with the four outliers removed. Note that
this routine is iterative, so after having removed outliers, the mean and standard deviation
of the culled sample are again computed and any outliers removed. This continues until no
outliers remain—use with caution:
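To make the iteration concrete, here is a hand-written sketch of iterative sigma clipping. It is not the library routine referred to above; the function name sigma_clip, its parameters, and the test data are assumptions for illustration:

```python
import math
import random

def sigma_clip(values, nsigma=3.0):
    """Repeatedly drop points more than nsigma std. deviations from the mean."""
    kept = list(values)
    while True:
        n = len(kept)
        if n < 2:
            return kept
        mean = sum(kept) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in kept) / n)
        survivors = [v for v in kept if abs(v - mean) <= nsigma * std]
        if len(survivors) == len(kept):    # nothing removed: converged
            return kept
        kept = survivors                   # recompute statistics and repeat

random.seed(1)
data = [random.gauss(0.0, 1.0) for _ in range(1000)] + [25.0, -30.0]
clipped = sigma_clip(data)
print("kept %d of %d points" % (len(clipped), len(data)))
```

Because the mean and standard deviation are recomputed after each pass, points that survived an earlier pass can be rejected later, which is why the text advises caution.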
1 Source: www.wefearchange.org/2010/06/import-this-and-zen-of-python.html.
2 For a more in-depth treatment of the topics in this subsection, see docs.python.org/2/howto/functional.html.
3 Tkinter is the most commonly used GUI programming toolkit for Python, but there are others. See
wiki.python.org/moin/TkInter for links to more information and tutorials.
4 www.wxpython.org
5 www.blog.pythonlibrary.org/2010/07/19/the-python-lambda
Chapter 9
The values of the attributes can be retrieved again with dot notation and we can use
them in any valid expression:
Note that even without having implemented a function to print the values of the
attributes of our instance, we can always print all the attributes of our object using the pprint
module, which ‘pretty prints’ any Python data structure in a form which could be
used as input to the interpreter:
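A minimal sketch, assuming a bare stand-in for the chapter's Particle class; the attribute names other than mass are illustrative. The built-in vars() returns the instance's attribute dictionary, which pprint then displays:

```python
from pprint import pprint

class Particle(object):
    """Bare stand-in for the chapter's Particle class (attributes assumed)."""
    pass

p = Particle()
p.mass = 1.0
p.x, p.y = 0.5, -0.25

attributes = vars(p)     # dictionary of the instance's attributes
pprint(attributes)
```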
When we run this we obtain the expected result and it is an instance of the Particle
class:
Like lists, class attributes are mutable, so for example we could have a function
that adds to the mass of one of our particles:
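Such a function might be sketched as follows; the name add_mass and the stand-in class are assumptions:

```python
class Particle(object):
    """Bare stand-in for the chapter's Particle class."""
    pass

def add_mass(particle, dm):
    """Increase the mass attribute of particle in place by dm."""
    particle.mass += dm      # modifies the object the caller passed in

p = Particle()
p.mass = 1.0
add_mass(p, 0.5)
print(p.mass)
```

No return value is needed: the function receives a reference to the same object, so the change is visible to the caller.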
Returning to our original definition of the Particle class3, we can bring our print
function into the class definition and make it a print method:
The convention is to use self as the first parameter of a method. Note that a method
is invoked using dot notation, as in the example above:
Now when we create an instance, the values of the attributes are set, even if we call the method with
no arguments:
We can create an instance using some or all of the arguments. We can call using positional
arguments or keyword arguments:
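The pattern can be sketched with an __init__ method that supplies default values; the attribute names, defaults, and the describe method are illustrative assumptions rather than the book's definition:

```python
class Particle(object):
    def __init__(self, mass=1.0, x=0.0, y=0.0):
        # self refers to the instance being created.
        self.mass = mass
        self.x = x
        self.y = y

    def describe(self):
        """Return a short text summary of this particle."""
        return "m=%g at (%g, %g)" % (self.mass, self.x, self.y)

p1 = Particle()                      # all defaults
p2 = Particle(2.0, 1.0)              # positional: mass and x
p3 = Particle(y=3.0, mass=5.0)       # keywords, in any order
print(p1.describe())
print(p3.describe())
```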
The __str__ method is another special method that is designed to return a string
representation of an object. We can modify our print function to fill this role. The
__str__ method is invoked when you print an object. Within class Particle, we have
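A sketch of a __str__ method on a stand-in Particle class; the attributes and the output format are assumptions:

```python
class Particle(object):
    def __init__(self, mass=1.0, x=0.0, y=0.0):
        self.mass = mass
        self.x = x
        self.y = y

    def __str__(self):
        # Must return a string; print() and str() call this automatically.
        return "Particle(mass=%g, x=%g, y=%g)" % (self.mass, self.x, self.y)

p = Particle(mass=2.5, x=1.0)
text = str(p)
print(p)
```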
This ends our discussion of Python class objects but, as you can imagine, we have
barely scratched the surface of the discussion of classes or object-oriented programming.
For more information, see chapter 9 of The Python Tutorial
(docs.python.org/2/tutorial/classes.html).
1 For an excellent discussion of Python classes and their implementation, see Downey A 2012 Think Python: How to
Think Like a Computer Scientist (Needham, MA: Green Tea) www.greenteapress.com/thinkpython.
2 In many N-body codes, the masses of all particles are identical, which reduces computation time. For more on N-body
methods (and more), see Bodenheimer P, Laughlin G P, Różyczka M and Yorke H W 2007 Numerical Methods in
Astrophysics: An Introduction (Boca Raton, FL: Taylor and Francis).
3 Note: the complete definition of the Particle class as discussed in this chapter is available at the companion website
pythonessentials.com in the file _ .
Chapter 10
and you should obtain a plot that looks something like figure 10.1.
Figure 10.1. A simple line plot.
For a simple point plot (see figure 10.2), the code file ( _ _ )
might be
Figure 10.2. A simple point plot.
Note that to leave a little white space between the plotted points and the plot frame we
have specified the plot axes with the axis()
command, which expects a list as an argument (you do not pass the values directly).
The complete list of arguments to plot() for setting the line/point type and
color is too long to include here, but a useful subset includes
You can also specify grayscale intensities as a string (‘0.6’), or specify the color as a
hex string (‘#5D0603’). You can also specify the .
To plot multiple data sets on the same axis, just call plot() multiple times.
The axes will autoscale to fit all of the data as shown in figure 10.3, but because
does not automatically insert white space between the data sets and the
axes, you will usually want to tweak your final plot ranges manually, as in the previous
example. Note this code ( _ _ ) also demonstrates the use of the legend()
function:
Figure 10.3. A figure showing multiple data sets and a legend.
Note that here we first call figure() to initialize our figure space. This is
optional but good practice. We next specify the subplot, where the argument
says make two plots vertically and one horizontally, and we are plotting in the first
(top) plot until we give a new subplot command. If you had, for example, a 2×2
grid of subplots, then subplots 1, 2, 3 and 4 would be top left, top right,
bottom left and bottom right, respectively.
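A compact sketch of a two-panel figure follows. It is not the original listing: the data, labels, and the pyplot alias plt are assumptions, and the Agg backend is selected only so that the code also runs without a display:

```python
import math
import matplotlib
matplotlib.use("Agg")            # assumption: non-interactive backend
import matplotlib.pyplot as plt

x = [0.1 * i for i in range(101)]

fig = plt.figure()               # initialize the figure space
plt.subplot(2, 1, 1)             # two rows, one column, first (top) panel
plt.plot(x, [math.sin(v) for v in x], "b-", label="sin(x)")
plt.plot(x, [math.cos(v) for v in x], "r--", label="cos(x)")
plt.axis([-0.5, 10.5, -1.2, 1.2])    # [xmin, xmax, ymin, ymax] as one list
plt.legend()

plt.subplot(2, 1, 2)             # later commands go to the bottom panel
plt.plot(x, [v * v for v in x], "ko")

n_panels = len(fig.axes)
top_xlim = tuple(fig.axes[0].get_xlim())
```

In an interactive session you would typically finish with plt.show(), or with plt.savefig() to write the figure to a file.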
10.8 3D plots
Note that we use the NumPy meshgrid function in the previous example. This
function makes it easy to generate a mesh of x–y values from 1D arrays that can then be
used in statements that assign values to a 2D z array. For example:
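A small sketch of numpy.meshgrid with deliberately tiny arrays so the shapes are easy to inspect; the grid limits and the function z = x^2 + y^2 are illustrative assumptions:

```python
import numpy as np

x = np.linspace(-2.0, 2.0, 5)     # 5 points along x
y = np.linspace(-1.0, 1.0, 3)     # 3 points along y

# X and Y are 2D arrays of shape (len(y), len(x)).
X, Y = np.meshgrid(x, y)

# Element-wise arithmetic on the grids fills the 2D z array in one statement.
Z = X ** 2 + Y ** 2
print(Z.shape)
```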
For the surface plot, let us step up the complexity just a bit by using a color map to
shade the surface according to the z values, making the surface slightly transparent with
the alpha keyword, adding a contour plot on the base plane and adding a colorbar to the right of
the plot (see figure 10.11). The following lines of code replace the line beginning
_ in the wire3D.py example:
Figure 10.11. A 3D surface plot with a contour plot base and semi-transparent
surface.
1 I also keep a similar code that plots points instead of lines.
2 For more examples, see the mplot3d tutorial at matplotlib.org/mpl_toolkits/mplot3d/tutorial.html.
Chapter 11
Applications
Python has a substantial body of existing packages that can be imported to efficiently
solve many classes of problems. In this chapter, I highlight just a few of these that I find
particularly useful.
SciPy
Next we present an example of fitting a sine curve to a generated curve with added noise
using the function (see figure 11.3). In this example ( ),
all of the data have the same weight, but it should be obvious from the code how to
include point-by-point weighting if appropriate for your data. The code for this is a bit
long.
Figure 11.3. An example of a non-linear least squares fit to sinusoidal data with
noise.
11.1.3 Linear systems of equations
Both SciPy and NumPy have linear algebra solvers. The SciPy routines may be faster
because they are always compiled with BLAS/LAPACK support, whereas this is optional
for NumPy, and can be much faster if SciPy is built using the optimized ATLAS LAPACK
and BLAS libraries. Solving a system of linear equations is quite straightforward. Say you
have the system of equations
1x + 2y + 3z + 4w = 9
3x + 4y + 9z + 6w = 8
8x + 3y + 8z + 1w = 7
7x + 4y + 3z + 2w = 7.
We can solve this by initializing a nested NumPy array with the coefficients on the left-hand
side and a second array with the values on the right-hand side of the equals signs. The code
( ) to solve this particular problem is
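One way to solve this system is sketched below using numpy.linalg.solve; the original listing may have used the SciPy routine instead (scipy.linalg.solve takes the same two arguments), and the variable names are assumptions:

```python
import numpy as np

# Coefficients of x, y, z, w, one row per equation.
A = np.array([[1.0, 2.0, 3.0, 4.0],
              [3.0, 4.0, 9.0, 6.0],
              [8.0, 3.0, 8.0, 1.0],
              [7.0, 4.0, 3.0, 2.0]])
# Right-hand sides of the four equations.
b = np.array([9.0, 8.0, 7.0, 7.0])

solution = np.linalg.solve(A, b)       # values of x, y, z, w
residual = np.dot(A, solution) - b     # should be zero to rounding error
print(solution)
```

Checking the residual is a cheap way to confirm that the coefficients were entered in the intended order.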
The SciPy module contains the signal module, which contains a long list of useful
functions, including lombscargle(), which returns an estimate of the power spectral
density using the Lomb–Scargle periodogram. To call this function, we only need to
calculate the angular frequencies and call it as follows using our previous definitions:
11.5 Writing sound files
It can be fun and perhaps even useful to turn your data into a sound file. The following
example4 does just that ( ). It reads in two-column data, normalizes to the
range -16384 to +16384, then uses the wave module functions to create and write a WAV
file:
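The steps can be sketched with the standard-library wave and struct modules. This is not the original listing: instead of reading a two-column data file it generates a one-second 440 Hz tone, and the sample rate, output path, and variable names are assumptions:

```python
import math
import os
import struct
import tempfile
import wave

rate = 8000                                          # samples per second
times = [i / float(rate) for i in range(rate)]       # one second of data
signal = [math.sin(2.0 * math.pi * 440.0 * t) for t in times]

# Normalize to the range -16384 to +16384 (fits in a signed 16-bit integer).
peak = max(abs(v) for v in signal)
scaled = [int(round(16384 * v / peak)) for v in signal]

path = os.path.join(tempfile.gettempdir(), "tone_sketch.wav")
wav = wave.open(path, "wb")
wav.setnchannels(1)          # mono
wav.setsampwidth(2)          # 2 bytes = 16 bits per sample
wav.setframerate(rate)
# Pack all samples as little-endian signed 16-bit integers.
wav.writeframes(struct.pack("<%dh" % len(scaled), *scaled))
wav.close()
```

For real data, the signal list would be replaced by the second column of the input file, with the sample rate chosen to put the variations in the audible range.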
1 See, e.g., mathworld.wolfram.com/LorenzAttractor.html.
2 A classic and comprehensive text introducing Fourier transforms is Bracewell R 1999 The Fourier Transform & Its
Applications (New York: McGraw-Hill).
3 See chapter 7 of Newman M 2012 Computational Physics (Scotts Valley, CA: CreateSpace) for a thorough discussion
of applications of Fourier transforms.
4 Based on codingmess.blogspot.com/2010/02/how-to-make-wav-file-with-python.html.
Chapter 12
distribution and is what we will explore in this chapter. If you have the need to interface
with C/C++, see the documentation at docs.python.org/2/extending/extending.html.
Here we will use the specific example of a Fortran subroutine ( in file
) that calculates a simple DFT, as discussed in section 11.4. The subroutine
calculates the normalized amplitudes, such that when passed a noise-free sine curve with
amplitude 1.0, the calculated amplitude will equal 1.0 at the appropriate frequency. The
implementation here does not require that the input data be equally spaced, as is required
when using an FFT.
The equivalent Python implementation of this is function from our module
_ ,
where both implementations return the same values to six decimal places.
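A pure-Python DFT of the kind described above might look like the following sketch. The function name dft_amplitudes, its argument order, and the test signal are assumptions; the factor 2/N is one normalization that returns an amplitude of 1.0 for a noise-free unit-amplitude sine sampled over a whole number of cycles:

```python
import math

def dft_amplitudes(times, values, freqs):
    """Normalized DFT amplitudes; the times need not be equally spaced."""
    n = len(times)
    amps = []
    for f in freqs:
        omega = 2.0 * math.pi * f
        # Project the data onto cosine and sine at this frequency.
        c = sum(v * math.cos(omega * t) for t, v in zip(times, values))
        s = sum(v * math.sin(omega * t) for t, v in zip(times, values))
        amps.append(2.0 * math.sqrt(c * c + s * s) / n)
    return amps

times = [0.1 * i for i in range(1000)]               # 100 s of data
values = [math.sin(2.0 * math.pi * t / 10.0) for t in times]   # 10 s period
freqs = [0.05, 0.1, 0.2]                             # Hz
amps = dft_amplitudes(times, values, freqs)
print(["%.6f" % a for a in amps])
```

The double loop over frequencies and data points is what makes the interpreted version slow, and it is exactly the part that a compiled routine replaces.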
Given our Fortran subroutine, we can use f2py to create a module that can be
imported and used:
This creates a file on your system with the basename and an extension that is
the appropriate extension for a Python extension module on your platform (e.g., ,
. , etc). The module is now importable, but all array dimensions must be
declared in the calling function.
In this example ( ) we generate a time series data set of 10 000
points consisting of two periods, of ten and three seconds, of different amplitudes. We
calculate the DFT at 1000 frequency points using both the Python code and the code in the
Fortran-derived module ( ). We use the function from the
module to determine how much time is spent in each of the two DFT routines and find that
the Python DFT function takes some 150 times longer to complete than the Fortran
subroutine:
1 See, for example, docs.scipy.org/doc/numpy/user/c-info.python-as-glue.html.